Abstract

A classical model for social-influence-driven opinion change is the threshold model. Here we study cascades of
opinion change driven by threshold model dynamics in the case where multiple initiators trigger the cascade,
and where all nodes possess the same adoption threshold $\phi$. Specifically, using empirical and stylized models
of social networks, we study cascade size as a function of the initiator fraction $p$. We find that even for
arbitrarily high values of $\phi$, there exists a critical initiator fraction $p_c(\phi)$ beyond which the cascade
becomes global. Network structure, in particular clustering, plays a significant role in this scenario. Similarly
to the case of single-node or single-clique initiators studied previously, we observe that community structure
within the network facilitates opinion spread to a larger extent than a homogeneous random network. Finally, we
study the efficacy of different initiator selection strategies on the size of the cascade and the cascade window.



Introduction

It has long been known through empirical studies that in a population of socially interacting individuals
where each individual node holds an opinion from a binary set, a small fraction of initiators holding the opinion
opposite to the one held by the majority can trigger large cascades and eventually result in a dominant
majority holding the initiators' opinion. Some recent studies have investigated such phenomena in the
context of the adoption of scientific, cultural, and commercial products [1, 2]. One of the simplest models
that captures adoption dynamics, irrespective of context, is the threshold model [3, 4, 5, 6]. According to
the threshold model, an individual changes its opinion only if a critical fraction of its neighbors have already
adopted the new opinion. This required fraction of new adoptees in the neighborhood is designated the
adoption threshold. Here we denote the adoption threshold by $\phi$. Since its introduction [3], the threshold
model has been studied extensively on complex networks to analyze the conditions under which a vanishingly
small fraction (of the total system size) of initiators is capable of triggering a cascade of opinion change
[5, 6, 7]. In particular, these studies considered initial conditions with a single "active" node [1] or an active
connected clique (a single node and all of its neighbors) [6] as initiators. In this scenario, the condition for
global cascades in connected sparse random networks is $\phi < 1/\langle k \rangle$ [5, 6, 7], where
$\langle k \rangle$ is the average degree of the network. However, with the exception of [8], little attention has
been paid to the question of how the size and the selection of this initiator fraction affect the spreading of an
opinion in the network, in particular, in the regime where a single active node or clique is insufficient to
trigger global cascades.

In the case of multiple initiators, how to select these initiators from among the nodes of the network so as
to maximize the spread (cascade size) remains an open question. To address this issue we compare three
different heuristic ways of selecting a set of initiators of predefined size on Erdős-Rényi (ER) random
networks [9]. Specifically, we look at the size of the spread for a varying range of the average degree
$\langle k \rangle$ of the ER networks. As found earlier for the case of cascades triggered by single initiators
[5, 6], we find that when the average degree is too low or too high, large cascades are not triggered. However,
within an intermediate range of $\langle k \rangle$, large cascades are realized. This range is referred to as the
cascade window [5]. We find that the width of this cascade window is the largest when the initiator nodes are
selected successively in descending order of degree, starting with the node having the largest degree. We also
find that the total time taken for the cascade to terminate is shortest for this selection strategy.

In both ER [4] and empirical [6] networks it was observed that for a given $\langle k \rangle$, there is a critical
threshold $\phi_c$ such that cascades are only triggered if $\phi < \phi_c$ for a single-node or a very small
initiator set [5, 6]. Here, we systematically study the effect of varying the initiator fraction $p$ with $\phi$
held fixed, for the entire range of values of the adoption threshold $\phi$. We find that for any given threshold
$\phi < 1$ there exists a critical value of the fraction of initiators $p_c$, above which global cascades can be
triggered. We discuss the dependence of $p_c$ on $\phi$, which turns out to be a smooth curve separating the two
phases, one in which cascades are observed and the other where cascades cannot be triggered. This finding
constitutes an important insight into how local neighborhood-level thresholds can constrain the emergence of
tipping points for cascades on global scales on sparse graphs. We note that in Ref. [8], the authors went beyond
basic heuristic selections of the initiators (targets) by employing a systematic greedy selection; however, they
did not explore the region for global [$O(N)$] cascades (and the corresponding tipping point $p_c$ of initiators
required to trigger them), but rather focused only on the $p \ll 1$ regime.

Details of the network structure beyond the average degree $\langle k \rangle$ also play an important role in the
spreading process [10]. The network's degree distribution and the presence of community structure and
local clustering can significantly affect the dynamics of spread [6, 11]. To elucidate the effect of clustering,
we study the effect of network rewiring on the cascades triggered by different methods. Specifically, starting
from an empirical network with a community structure and relatively high clustering, we redistribute the
links in the network while preserving the original degree sequence, using a number of different methods.
Cascades are found to be larger and more likely in the original network which, in addition to having an
inherent community structure, has a much higher clustering coefficient (essentially capturing the density of
triads) [10]. These results indicate that local clustering, just as in the case of a single-node (or single-clique)
initiator [6], facilitates the spreading of global cascades in the case of multiple initiators as well.

Results 

In the threshold model, every node in the network can be in one of two possible states, 0 (inactive) or 1
(active), which can also be thought of as signifying distinct binary opinions on an issue. The typical initial
condition for studying threshold model dynamics is one where all nodes except a minority (the initiators)
are in state 0. The dynamics then proceeds as follows. At each time step, a node is selected at random. If
the node is inactive, it becomes active if at least a threshold fraction $\phi$ of its neighboring nodes are active,
i.e., in state 1. The active state is assumed to be permanent, i.e., once a node becomes active it remains
active indefinitely. The system evolves according to these rules until no further activations can occur. The
threshold $\phi$ can, in general, be different for every node, but for simplicity we consider the case where every
node has the same threshold. The size of the cascade at any point during its evolution, or after it has
terminated, is quantified by the fraction of active nodes in the network. In the following sections we discuss
the simulation of these dynamics for various network topologies.
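
As a minimal sketch of the update rule described above (not the authors' simulation code), the dynamics can be written as follows. The adjacency representation `adj` (a dictionary of neighbor sets), the function name `run_threshold_cascade`, and the use of random-order sweeps in place of single random node selections are assumptions made for illustration. Because activation is permanent, the final set of active nodes does not depend on the update order, although the timing does. Isolated nodes are assumed never to activate.

```python
import random


def run_threshold_cascade(adj, initiators, phi, rng=None):
    """Run threshold dynamics to completion; return the final fraction of active nodes.

    adj        : dict mapping each node to the set of its neighbours (undirected graph)
    initiators : iterable of nodes that start in state 1
    phi        : common adoption threshold, 0 <= phi <= 1
    rng        : optional random.Random instance (illustrative parameter)
    """
    rng = rng or random.Random()
    active = set(initiators)
    nodes = list(adj)
    changed = True
    while changed:                      # stop after a full sweep with no activation
        changed = False
        rng.shuffle(nodes)              # visit nodes in random order (assumption)
        for v in nodes:
            if v in active or not adj[v]:
                continue                # active nodes stay active; isolated nodes never adopt
            active_fraction = sum(1 for u in adj[v] if u in active) / len(adj[v])
            if active_fraction >= phi:  # "at least a threshold fraction phi"
                active.add(v)
                changed = True
    return len(active) / len(nodes)


if __name__ == "__main__":
    # Path graph 0-1-2-3 with node 0 as the only initiator.
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
    print(run_threshold_cascade(path, [0], phi=0.5))
    print(run_threshold_cascade(path, [0], phi=0.6))
```

On the four-node path, a threshold of 0.5 lets the activation travel along the whole chain, while a threshold of 0.6 stops it at the initiator, since node 1 then needs both of its neighbors to be active.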

Selection strategies 

The decision of a node to adopt state 1 depends only on the states of its neighbors. If the fraction of its
neighborhood in state 1 reaches the threshold $\phi$, the node updates its state. As a result of this threshold
condition, a node's degree plays an important role in determining how easily it can be influenced. The threshold
condition is more easily satisfied for a low-degree node than for a high-degree node, since the former requires
fewer active neighbors than the latter, given a fixed adoption threshold $\phi$ for all nodes. Similarly, the
average degree of the network $\langle k \rangle$ determines to what extent, if at all, the entire network can be
influenced. For a fixed number of initiators, high-degree nodes are less likely to be influenced because a larger
number of their neighbors must be active to satisfy the threshold condition. A high $\langle k \rangle$ is
therefore not a desirable condition for cascades. On the other hand, for low $\langle k \rangle$, the network
consists of disconnected clusters of sizes less than $O(N)$, and cascades remain confined to one or a few of these
clusters. As a result, global cascades only become possible in an intermediate range of $\langle k \rangle$, the
cascade window. In general, the size of the cascade window depends on both the threshold $\phi$ and the initiator
fraction $p$.
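
To make the degree dependence concrete, the following sketch computes the smallest number of active neighbors that satisfies the threshold condition for a node of a given degree, assuming the "at least a fraction $\phi$" form of the rule. The function name `active_neighbours_needed` and the small floating-point tolerance `tol` are illustrative choices, not part of the original model description.

```python
import math


def active_neighbours_needed(k, phi, tol=1e-9):
    """Smallest integer m such that m / k >= phi, for a node of degree k >= 1."""
    if k < 1:
        raise ValueError("degree must be at least 1")
    # The tolerance guards against phi * k landing just above an integer
    # because of floating-point rounding.
    return math.ceil(phi * k - tol)


if __name__ == "__main__":
    for k in (2, 5, 6, 10, 50):
        print(k, active_neighbours_needed(k, 0.18))
```

For $\phi = 0.18$, a single active neighbor suffices only for nodes of degree 5 or less (since $1/\phi \approx 5.6$), whereas a node of degree 50 needs 9 active neighbors. This is the same $1/\phi$ scale that appears in the single-initiator cascade condition.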

The precise choice of initiators also plays an important role in the size of the cascade and consequently
in the cascade window itself. A strategic selection of initiators can dramatically increase the average size of
the spread, which we denote by $S$. Here, we compare three heuristic strategies for selecting a set of initiators
constituting a fraction $p$ of the total network size: (i) random selection, (ii) selection in descending
order of degree, and (iii) selection in descending order of $k$-shell index [12].
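
The three heuristics can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the $k$-shell index is computed by the standard iterative pruning of low-degree nodes, ties between nodes of equal degree or equal shell index are broken at random (the text does not specify a tie-breaking rule), and the names `k_shell_index` and `select_initiators` are hypothetical.

```python
import random


def k_shell_index(adj):
    """k-shell index of every node, obtained by repeatedly pruning low-degree nodes."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        peel = [v for v in remaining if degree[v] <= k]
        if not peel:
            k += 1                      # nothing left to prune at this level
            continue
        while peel:
            v = peel.pop()
            if v not in remaining:
                continue                # already pruned via another neighbour
            remaining.discard(v)
            shell[v] = k
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        peel.append(u)
    return shell


def select_initiators(adj, p, strategy, rng=None):
    """Pick round(p * N) initiators by 'random', 'degree', or 'k-shell' ranking."""
    rng = rng or random.Random()
    n_init = round(p * len(adj))
    nodes = list(adj)
    rng.shuffle(nodes)                  # random order; also randomizes ties below
    if strategy == "degree":
        nodes.sort(key=lambda v: len(adj[v]), reverse=True)   # stable sort keeps tie order
    elif strategy == "k-shell":
        shell = k_shell_index(adj)
        nodes.sort(key=lambda v: shell[v], reverse=True)
    elif strategy != "random":
        raise ValueError("unknown strategy: " + repr(strategy))
    return set(nodes[:n_init])


if __name__ == "__main__":
    # Triangle 0-1-2 with a two-node tail 0-3-4.
    g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}}
    print(k_shell_index(g))
    print(select_initiators(g, 0.2, "degree"))
```

In the small example, the triangle nodes form the 2-shell and the tail nodes the 1-shell, while node 0 is the unique highest-degree node.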

We first look at the average spread size $S$ as a function of the average degree $\langle k \rangle$ on ER random
graphs with $N = 1000$ and $\phi = 0.18$, for a fixed fraction of initiators $p = 0.01$ [Fig. 1(a)]. When
$\langle k \rangle$ is small, all three strategies perform similarly because the network consists only of small
clusters without a giant component, and hence the spread is localized to those clusters. As soon as
$\langle k \rangle$ becomes large enough for a giant component to arise, the spread covers a large portion of the
network. Further increasing $\langle k \rangle$ makes it harder for the nodes to satisfy the threshold condition,
and $S$ decreases again.
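
The role of the giant component at the lower edge of the cascade window can be illustrated with a short sketch that generates an ER graph and measures the relative size of its largest connected component. The $G(N, q)$ construction with link probability $q = \langle k \rangle/(N-1)$ and the function names are assumptions made for this illustration; the text does not state which ER variant was used.

```python
import random
from collections import deque


def erdos_renyi(n, mean_degree, rng=None):
    """G(n, q) random graph with q = mean_degree / (n - 1), as a dict of neighbour sets."""
    rng = rng or random.Random()
    q = mean_degree / (n - 1)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < q:        # each pair is linked independently
                adj[u].add(v)
                adj[v].add(u)
    return adj


def largest_component_fraction(adj):
    """Fraction of nodes in the largest connected component (breadth-first search)."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            v = queue.popleft()
            size += 1
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    queue.append(u)
        best = max(best, size)
    return best / len(adj)


if __name__ == "__main__":
    rng = random.Random(0)
    for mean_degree in (0.5, 1.5, 6.0):
        g = erdos_renyi(500, mean_degree, rng)
        print(mean_degree, largest_component_fraction(g))
```

Below an average degree of 1 the largest component contains only a small fraction of the nodes, so any cascade is necessarily local. Well above it, nearly all nodes belong to a single component and the threshold condition becomes the limiting factor.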

To understand the differences in the performance of these heuristics, we first note that there are two
distinct aspects determining the efficacy of a node as an initiator. First, it must be capable of influencing a
large number of nodes, i.e., it should have a large degree. Second, it must be connected to nodes which have an
easily satisfiable threshold condition, i.e., the degrees of its neighbors must be sufficiently low. Additionally,
and related to the first point, it also makes sense to choose the highest-degree nodes as initiators, since they
are the hardest to influence. In light of these arguments, the highest-degree selection strategy appears to
be a natural choice for generating large cascades. It would appear that high $k$-shell nodes are a comparably
good choice, since high $k$-shell nodes also possess a high degree. However, by construction, nodes in the
highest $k$-shells are a special subset of the high-degree nodes that are predominantly connected to other
nodes of high degree. In other words, nodes selected in descending order of their $k$-shell index have fewer
easily influenceable neighbors than nodes selected purely on the basis of degree. This qualitatively explains
why the $k$-shell method does not perform as well as the high-degree selection. Finally, the random selection
performs the worst, since it largely selects low-degree nodes, which trigger a small number of cascades, many
of which terminate when they encounter a high-degree node.

An increase in the initiator fraction $p$ makes the cascade window wider by allowing cascades to occur
for even higher $\langle k \rangle$ values, as shown in Fig. 1(b), where $p$ is increased to 0.02. The selection
strategies follow the same ranking in this case as well.

Simulation results indicate that the highest-degree method also performs best (followed by
the $k$-shell method) in terms of the speed of the cascade. The results for $p = 0.01$ and $p = 0.02$ are shown
in Figs. 1(c) and (d), respectively.

Tipping point for multiple initiators 

As discussed in the previous section, for a small [$O(1)$-size] seed of initiators, cascades can only occur if
$\phi$ is smaller than a critical value ($\phi < 1/\langle k \rangle$ for sparse random graphs [5, 6, 7]). However,
this does not hold if we introduce a sufficiently large fraction of initiators in the system. We look at the
quantity $S$ (average fraction of nodes in state 1) as a function of $p$. (We will refer to $S$ as cascade size for
short.) Gradually increasing $p$ shows that in the beginning, when $p \ll 1$, (global) cascades are not observed.
When $p$ reaches a critical value $p_c$, a discontinuous transition occurs and large cascades are seen
immediately, as shown in Fig. 2(a). The need for a minimum critical fraction of committed nodes for consensus has
been observed in different models of influence [13, 14, 15] (see Discussion for more details).

Since starting with a finite $p$ itself accounts for a large number of nodes in state 1, the relevant quantity
to look at is the number of nodes that were initially in state 0 and eventually adopted state 1 (i.e., excluding
the initiators). Thus, we define

$$\tilde{S} = \frac{S - p}{1 - p}, \tag{1}$$

which measures the fraction of non-initiator nodes that participate in the cascade. Transitions in $\tilde{S}$ are
shown in Fig. 2(b) for different $\phi$ values and several network sizes. It can clearly be seen that the transition
depends only on $\phi$ and is independent of the system size $N$. This transition (the emergence of the tipping
point) is quite generic in the threshold model, and can be observed in networks with different sizes and
average degrees, as well as for different selection methods for initiators (see Supplementary Information
Sections S.1 and S.2 for more details).
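
As a small worked illustration of the scaled cascade size, the sketch below assumes that it is the fraction of non-initiator nodes that end up active, i.e. $(S - p)/(1 - p)$, where $S$ is the final fraction of active nodes including the initiators. The function name `scaled_cascade_size` is hypothetical.

```python
def scaled_cascade_size(cascade_size, p):
    """Fraction of non-initiator nodes that became active: (S - p) / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("initiator fraction p must satisfy 0 <= p < 1")
    return (cascade_size - p) / (1.0 - p)


if __name__ == "__main__":
    # No spreading beyond the initiators, partial spreading, and a full cascade.
    for S, p in ((0.01, 0.01), (0.55, 0.10), (1.0, 0.30)):
        print(S, p, scaled_cascade_size(S, p))
```

With this normalization, a run in which only the initiators are active gives 0 and a run that reaches the whole network gives 1, regardless of how large $p$ is.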

The critical point $p_c$ in each case is calculated by numerically computing the derivative of $S$ with respect
to $p$ and finding its maximum. Having calculated $p_c$ allows us to explicitly look at the relationship between
$p_c$ and $\phi$, as shown in Fig. 3 for different average degrees $\langle k \rangle$. As $\langle k \rangle$
increases, all curves appear to converge to the limiting case of the fully connected network (complete graph),
for which $p_c = \phi$. Therefore, for a given threshold $\phi$, the minimum number of initiators needed to
trigger large cascades can be estimated.
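
One simple way to locate the maximum of the numerical derivative is sketched below. It assumes that the cascade size has been measured on an increasing grid of initiator fractions and uses forward finite differences, reporting the midpoint of the steepest interval. The finite-difference scheme and the function name `estimate_critical_fraction` are assumptions, since the text does not specify how the derivative was computed.

```python
def estimate_critical_fraction(p_values, s_values):
    """Midpoint of the grid interval on which the slope dS/dp is largest.

    p_values : strictly increasing initiator fractions
    s_values : average cascade sizes measured at those fractions
    """
    if len(p_values) != len(s_values) or len(p_values) < 2:
        raise ValueError("need two equally long sequences with at least two points")
    best_slope = None
    best_p = None
    for i in range(len(p_values) - 1):
        slope = (s_values[i + 1] - s_values[i]) / (p_values[i + 1] - p_values[i])
        if best_slope is None or slope > best_slope:
            best_slope = slope
            best_p = 0.5 * (p_values[i] + p_values[i + 1])
    return best_p


if __name__ == "__main__":
    # Synthetic step-like curve (illustrative numbers, not simulation results).
    p_grid = [0.0, 0.1, 0.2, 0.3, 0.4]
    s_grid = [0.0, 0.1, 0.2, 1.0, 1.0]
    print(estimate_critical_fraction(p_grid, s_grid))
```

The resolution of the estimate is limited by the grid spacing, so a finer grid near the jump gives a sharper value. For the complete-graph limit no numerical estimate is needed: every inactive node sees an active fraction of approximately $p$ among its neighbors, so all nodes activate at once when $p$ reaches $\phi$.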

Impact of network structure and clustering 

In this section we study how the dynamics of the threshold model is affected by structural changes in
the network. We study the dynamics on an empirical high-school friendship network, using one particular
network from the Add Health data set [16] (also employed in [11]), and a few degree-sequence-preserving
randomized versions of it. To simplify things, we extract the giant component of the high-school network,
which has $N = 921$ nodes and $\langle k \rangle \approx 5.96$. Hereafter, we only consider the giant component of
this network and refer to it as the high-school network. The initiator fraction is kept fixed at $p = 0.01$. The
network contains two communities which are roughly equal in size. We generate two distinct ensembles of networks
from this high-school network by employing the following randomization methods:

1. The link swap method (henceforth referred to as x-swap), in which two links are selected at random
and then one end point of a link is swapped with the end point of the other link. An x-swap step is
disallowed if it results in fragmentation of the network. This swapping is done repeatedly so that the
network is randomized to an extent that any community structure, local clustering, or degree-degree
correlation is eliminated [17, 18, 19].

2. The exact sampling method by Del Genio et al. (DKTB) [20], in which a connected network is constructed
from the degree sequence of the original network. The algorithm takes as input the exact degree sequence
of the network and joins the link stubs from different nodes until every stub has been paired with
another stub [20, 21].
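
The x-swap step of the first method can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the paper: swaps that would create self-loops or duplicate links are rejected in addition to those that fragment the network, connectivity is re-checked after every swap by a full graph traversal (simple but slow), nodes are assumed to carry orderable labels such as integers, and the names `x_swap`, `is_connected`, and `max_attempts` are hypothetical. The exact sampling (DKTB) method is not sketched here.

```python
import random


def is_connected(adj):
    """True if every node can be reached from an arbitrary start node."""
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return len(seen) == len(adj)


def _rewire(adj, old_links, new_links):
    """Remove old_links and add new_links in place (undirected)."""
    for u, v in old_links:
        adj[u].discard(v)
        adj[v].discard(u)
    for u, v in new_links:
        adj[u].add(v)
        adj[v].add(u)


def x_swap(adj, n_swaps, rng=None, max_attempts=None):
    """Return a degree-preserving, connected randomization of adj (input is not modified)."""
    rng = rng or random.Random()
    adj = {v: set(nbrs) for v, nbrs in adj.items()}        # work on a copy
    edges = [(u, v) for u in adj for v in adj[u] if u < v]  # each link listed once
    if max_attempts is None:
        max_attempts = 100 * n_swaps
    done = 0
    for _ in range(max_attempts):
        if done == n_swaps:
            break
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:
            c, d = d, c                                     # random orientation of second link
        # Proposed move: (a, b), (c, d) -> (a, d), (c, b)
        if len({a, b, c, d}) < 4 or d in adj[a] or b in adj[c]:
            continue                                        # self-loop or duplicate link
        _rewire(adj, [(a, b), (c, d)], [(a, d), (c, b)])
        if is_connected(adj):
            edges[i], edges[j] = (a, d), (c, b)
            done += 1
        else:
            _rewire(adj, [(a, d), (c, b)], [(a, b), (c, d)])  # undo a fragmenting swap
    return adj


if __name__ == "__main__":
    n = 12
    ring = {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}
    shuffled = x_swap(ring, n_swaps=20, rng=random.Random(1))
    print(sorted(len(nbrs) for nbrs in shuffled.values()))
```

Each accepted swap changes which nodes are linked but leaves every node's degree unchanged, which is why the degree sequence of the randomized network matches that of the original.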

Both methods of randomization leave the degree sequence unchanged. We look at the size of the spread $S$
as a function of time for $p = 0.01$ in the original high-school network [Fig. 4(a)], the x-swapped high-school
network [Fig. 4(b)], and the network generated by the exact sampling method [Fig. 4(c)]. Analogous plots
for $p = 0.02$ are shown in Figs. 4(d-f). Different black trajectories correspond to different individual runs
and the average is shown by a thicker red curve. For the empirical high-school network, some runs reveal
the community structure, with the spread being faster in one community than in the other. This can also be seen
visually for a typical run in Fig. 5(a), while randomized networks do not show this behavior; see Figs. 5(b,c).
In general, the results show that triggered cascades are larger and more likely for a network with high local
clustering than for a randomized network with the same degree sequence [Fig. 3]. Note that the clustering
coefficient of the original high-school (HS) graph is $C_{HS} \approx 0.125$, while for its randomized versions
obtained by x-swaps (XS) and exact-degree-sequence (DKTB [20]) construction it is
$C_{XS} \approx C_{DKTB} \approx 0.008$ (see Supplementary Information Section S.3 for more details).
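
For readers who wish to compare clustering between a network and its randomized versions, a minimal sketch is given below. It assumes the average local clustering coefficient (for each node, the fraction of pairs of its neighbors that are themselves linked, with nodes of degree below 2 contributing zero). The text does not state which definition of the clustering coefficient was used, so this choice and the function name `average_clustering` are assumptions.

```python
from itertools import combinations


def average_clustering(adj):
    """Average over all nodes of the local clustering coefficient."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue                     # no neighbour pairs: contributes zero
        linked_pairs = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * linked_pairs / (k * (k - 1))
    return total / len(adj)


if __name__ == "__main__":
    triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    path = {0: {1}, 1: {0, 2}, 2: {1}}
    print(average_clustering(triangle), average_clustering(path))
```

A triangle gives the maximal value of 1 and a path gives 0, which brackets the values quoted for the original and randomized high-school networks.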

The average cascade size $S$ [Fig. 6(a) and (b)] and the probability of global cascades $P_c$ [Fig. 6(c) and (d)]
as a function of the threshold $\phi$ also indicate that strong clustering (present in empirical networks)
facilitates threshold-limited spreading. (We define a global cascade as a cascade that covers at least 60% of
the network size $N$.) Hence, this important feature of threshold-limited spreading [6] is preserved for the
case of multiple initiators studied here.
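
Given the final cascade sizes of a set of independent runs, the two quantities discussed above can be computed as in the following sketch. The 60% cutoff is taken from the definition of a global cascade in the text; the function name `summarize_runs` and the input format (a list of final active fractions) are illustrative assumptions.

```python
def summarize_runs(final_sizes, global_cutoff=0.6):
    """Return (average cascade size, fraction of runs that are global cascades)."""
    n_runs = len(final_sizes)
    if n_runs == 0:
        raise ValueError("need at least one run")
    mean_size = sum(final_sizes) / n_runs
    n_global = sum(1 for s in final_sizes if s >= global_cutoff)  # "at least 60%"
    return mean_size, n_global / n_runs


if __name__ == "__main__":
    # Illustrative final sizes from four hypothetical runs.
    print(summarize_runs([0.01, 0.90, 0.60, 0.30]))
```

Because individual runs tend to end either near the initiator fraction or near the full network, the average size and the global-cascade probability carry closely related but not identical information.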

The temporal evolution of the average cascade size in the original HS network, its two randomized
versions, and an ER network of the same size and with the same average degree is shown in Fig. 7. The
two methods of randomization (x-swap and exact sampling) give roughly the same cascade size $S$. In
the case of randomized networks, for some realizations the spread reaches the full network [Fig. 5(c)] and for
some realizations the spread is minuscule [Fig. 5(b)], and therefore $S < 1$.

Finally, analogous to Fig. 2, we show the emergence of global cascades (at the tipping point $p_c$) in the
high-school network as the density of initiators is varied [Fig. 8].

Discussion 

Several recent studies have addressed, for a variety of agent-based opinion spreading models, the impact of
a special set of initiators, viz. inflexible individuals [13], also referred to synonymously as committed agents
[11, 13, 15, 22, 23, 24, 25, 26], true believers [27], zealots [28, 29, 30], or inflexible contrarians [31, 32]. The
rules of state updating (or opinion switching) in these models are symmetric, and governed purely by the local
density of states in the neighborhood of a node. In such a system, the inflexible nodes constitute a special
set of nodes which never change their opinion, thereby breaking the symmetry of the system and giving rise
to tipping points beyond which the entire network conforms to the state adopted by the committed agents.
It has been shown that the emergence of tipping points in these models is related to metastable regions and
barriers (saddle points) in the corresponding opinion landscapes [14, 22, 23]. Because these models allow
frequent changes of state or opinion at the individual level, they are more suitable for scenarios
where switching an individual's state incurs virtually no cost.

In contrast to the above models, the threshold model (or the qualitatively similar threshold contact
process [33, 34, 35, 36]) is more suited to modeling the diffusion of innovations or the adoption of new products,
where investment in a new idea comes at a cost and the incentive to switch back after becoming active
is low. Here, spreading is an asymmetric process and is also inhibited by a local threshold: individuals
can only adopt the new product or norm if a sufficient fraction of their neighbors have already done so.
(The threshold model or threshold contact process is, in spirit, closer to the family of
Susceptible-Infected-Susceptible-like or contact-process-like models [12, 37, 38, 39, 40, 41, 42], in that the
spreading of a disease or norm is an inherently asymmetric process by the rules of the local dynamics.)

The focus of this work was to identify tipping points for global cascades triggered by multiple initiators 
and governed by local thresholds. Our findings demonstrate that these tipping points emerge in both ER 
and empirical high-school networks, in a qualitatively similar fashion. 

Further, we studied three different heuristic strategies to select a fraction of initiators for the threshold
model on ER networks as well as on an empirical network. Our results demonstrate that selecting initiators
by their degree (highest first) results in the largest (as well as fastest) spread. Naturally, for high values of
the local threshold ($\phi > 1/\langle k \rangle$), single initiators or small cliques cannot trigger global
cascades. We showed by simulations that there exists a critical value of the initiator fraction $p_c$ that is
needed to trigger cascades for high values of $\phi$. We also studied how structural changes, such as randomizing
an empirical network using different randomizing methods, would affect the size of the cascades triggered (in the
cases studied here) by multiple initiators. Our simulation results on the empirical high-school network show that
randomizing the network in fact results in narrower cascade windows compared to the original network with strong
clustering, implying that clustering facilitates spreading in threshold-limited diffusion with multiple initiators.

Figure 1: Cascade size $S$ as a function of the average degree $\langle k \rangle$ on ER networks of $N = 1000$
nodes with threshold $\phi = 0.18$ for different selection strategies of multiple initiators for (a) $p = 0.01$
and (b) $p = 0.02$. Time evolution of the average cascade size $S$ on ER networks of $N = 1000$ nodes with
average degree $\langle k \rangle = 6.0$ and threshold $\phi = 0.18$ for different selection strategies of
multiple initiators for (c) $p = 0.01$ and (d) $p = 0.02$.




Figure 2: Cascade size and scaled cascade size as a function of initiators on ER networks with
$\langle k \rangle = 10.0$. (a) Cascade size $S$ as a function of initiators $p$ for ER networks with
$N = 10000$ for different values of $\phi$. (b) Scaled cascade size $\tilde{S}$ [Eq. (1)] vs. $p$ for ER
networks with different network sizes $N$ and $\phi$ values.

Figure 3: Critical fraction of initiators for global cascades $p_c$ as a function of the local threshold value
$\phi$ for ER networks of size $N = 5000$ with various values of the average degree. The dashed line corresponds
to the exact limiting case on large complete graphs (fully connected networks), $p_c = \phi$.

Figure 4: Time evolution of the size of the cascades $S$ on the high-school (HS) network and its two
randomized versions with identical degree sequence (x-swapped and exact sampling [20]), with $N = 921$,
$\langle k \rangle = 5.96$, and $\phi = 0.18$, for two different values of the fraction of initiators: (a)
high-school friendship network, (b) x-swapped, (c) exact sampling for $p = 0.01$; (d) high-school friendship
network, (e) x-swapped, (f) exact sampling for $p = 0.02$. Depending upon the distribution of initiators, some
runs result in a cascade while some do not. Blue curves represent the conditional average over the runs for which
the spread reaches the full network, and red curves correspond to the average over all runs.

Figure 5: Visualizations of spreading in the threshold model (typical individual runs) for various networks
at different times during evolution (arrow on top indicates the direction of time evolution). $N = 921$,
$\langle k \rangle = 5.96$, $p = 0.01$, and $\phi = 0.18$. Nodes in state 1 (active nodes) are colored red. (a)
Original high-school network; (b) randomized network (by x-swap) when the eventual spread is local; and (c) the
same randomized network but for a run that reaches the whole network.

Figure 6: Cascade size and probability of global cascades in the high-school (HS) network and its two
randomized versions with identical degree sequence (x-swapped and exact sampling [20]) with $N = 921$ and
$\langle k \rangle = 5.96$. Cascade size $S$ as a function of $\phi$ for (a) $p = 0.01$ and (b) $p = 0.02$.
Probability of global cascades $P_c$ as a function of $\phi$ for (c) $p = 0.01$ and (d) $p = 0.02$.

Figure 7: Average cascade size as a function of time in the high-school network (HS), in its two randomized
versions with identical degree sequence (x-swapped and exact sampling [20]), and in ER networks with the
same average degree. $N = 921$, $\langle k \rangle = 5.96$, $p = 0.01$, and $\phi = 0.18$ in all cases.

Figure 8: Cascade size and scaled cascade size as a function of initiators on the high-school network
($N = 921$, $\langle k \rangle = 5.96$). (a) Cascade size $S$ as a function of initiators $p$ for different
values of $\phi$. (b) Scaled cascade size $\tilde{S}$ [Eq. (1)] vs. $p$ for different values of $\phi$.